Find broken links on web sites
Description
Xenu's Link Sleuth (TM) checks Web sites for broken links.
Link verification is done on "normal" links, images, frames, plug-ins,
backgrounds, local image maps, style sheets, scripts and Java applets.
It displays a continuously updated list of URLs which you can sort by different
criteria. A report can be produced at any time.
Additional features:
- Simple, no-frills user interface
- Can re-check broken links (useful for temporary network errors)
- Simple report format, can also be e-mailed
- Executable file less than 500K
- Supports SSL websites ("https://")
- Partial testing of ftp and gopher sites
- Detects and reports redirected URLs
- Site Map
Download
By downloading you are acknowledging that:
- You will personally check the software for viruses before starting it (I do the same, using Norton AntiVirus, with software I download)
- You will not hold me responsible for damages (lost time, crashed computer, etc.)
System requirements: Microsoft Windows 95/98/ME/NT/2000/XP; WININET.DLL
is required (it is usually included). No, it won't work on Windows 3.11, not
even with Win32s. No, I won't make a Java, MacOS, Linux, BeOS, Palm or
C64 version. Don't even ask! (However, I have been told that it will run
faultlessly under Red Hat 8 using Wine :-))
Attention CompuServe users: The old version of RPAWINET.DLL
(e.g. from 18.9.1996) that came with the WinCIM 3.0 CD-ROM is deadly -
go get the bugfix from CompuServe.
Ok, I have read all that, I want to download! (current version: 1.2f from August 6th, 2004)
Getting started:
Unzip it and install it wherever you want. To
check a site, click the toolbar icon on the left and enter a WWW address.
If the address ends with a directory name, don't forget to put a /
at the end, or the whole parent directory may get spidered.
Incorrect:
http://www.host.com/~user
Correct:
http://www.host.com/~user/
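Why the trailing slash matters can be shown with a small Python sketch. This is only an illustration, not how Link Sleuth itself works: the helper `crawl_scope` is hypothetical and assumes that a checker treats everything up to the last "/" of the start address as "the site".

```python
from urllib.parse import urljoin

def crawl_scope(start_url):
    """Return the directory prefix a checker would treat as 'the site'.

    Assumption for this sketch: the scope is everything up to and
    including the last '/' of the start address.
    """
    return start_url[: start_url.rfind("/") + 1]

without_slash = "http://www.host.com/~user"
with_slash = "http://www.host.com/~user/"

print(crawl_scope(without_slash))  # http://www.host.com/
print(crawl_scope(with_slash))     # http://www.host.com/~user/

# Relative links on the start page also resolve differently.
print(urljoin(without_slash, "page.html"))  # http://www.host.com/page.html
print(urljoin(with_slash, "page.html"))     # http://www.host.com/~user/page.html
```

Without the slash, "~user" looks like a file name, so the parent directory becomes the starting point.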
You can also click the "browse" button to check
a local HTML file. If you do not already use IE for browsing and are sitting
behind a company firewall, don't forget to configure your
proxy before you start. If you are using a personal firewall (like ZoneAlarm or Outpost), you must enable Microsoft Internet Explorer by starting it, entering a URL and then
"allowing" the application. To find out what the software can do, simply
try out the menu choices, the toolbar and the right mouse button. Or read
this small third-party manual, a big third-party manual with many pictures,
a third-party report (How I check over 6,000 links every seven to ten days),
or a German description and another.
Good luck! If you find the software useful, please click here.
Test everything. Hold on to the good. (1 Thessalonians 5:21)
Join the Update Announcements mailing list at Yahoo Groups! To subscribe, send an empty
e-mail to linksleuthupdates-subscribe@yahoogroups.com.
You can also join the user group by sending an e-mail to xenu-usergroup-subscribe@yahoogroups.com.
If you would like to use a button on your WWW page, link to this page with
this button: ![[Linkcheck by Xenu!]](xenu_button.gif)
The address of this web page is http://home.snafu.de/tilman/xenulink.html
Frequently Asked Questions (FAQ)
1. Who is Xenu?
See here.
2. Is Xenu's Link Sleuth (TM) better than WebAnalyzer?
Yes and No. Xenu's Link Sleuth (TM) does not have
the graphic capabilities of WebAnalyzer 2.0 ("Wavefront view"). But here
are some of the advantages of Xenu's Link Sleuth (TM):
- It is free
- Simple user interface
- Better error reports (not just "network error")
- "Save" also works while the software is busy
- The "broken links view" shows only broken links; in WebAnalyzer you'd have to press the button again and again as the window fills with crap.
- While Xenu does not offer an "update" facility (which doesn't work anyway), it has a "recheck broken links" function that works fine.
- It is small, written by one person with 5 years of experience in Windows development and 15 years of professional experience as a software developer. This means that bugs will be corrected quickly. This is a matter of honour.
- The report can be viewed easily, even when you have long URLs.
- Uses much less disk space for intermediate files; the executable file is much smaller
- Loading of saved files is much faster (WebAnalyzer loses time by displaying the extra graphics)
- Supports SSL websites ("https://")
- Partial testing of ftp and gopher sites
- Search for local orphan files
- Special handling of redirected URLs
- Site Map
- Randomization of checking order, which means fewer concurrent requests on a single server
Xenu sez: check your website both with this product and with another product
(Linkbot, InfoLink, LinkScan, LinkAlarm and Web Link Validator offer trial
versions - WebAnalyzer is no longer available since February 2002 and hasn't
been updated for years), and decide what you need and what you are willing to pay.
3. Is Xenu's Link Sleuth (TM) better than Net Mechanic?
Years ago, Net Mechanic was a free WWW based service, and was useful to
check very small web sites. It is no longer free. The free trial
is too small, and reports about all links, instead of just the broken ones.
4. Can I support the author financially?
No need to. If you feel the software is useful, you may donate money to
causes I support.
- AFF is a nonprofit, tax-exempt
research center and educational organization founded in 1979. AFF's mission
is to study psychological manipulation and cultic groups, to educate the
public and professionals, and to assist those who have been adversely affected
by a cult-related experience. I suggest a donation of $20 for individuals
and $200 for corporations. In the US, your donation can be deducted from
your income. (AFF does not endorse this site in any way, did not develop
this software, does not sell this software, and the use of this software
does not depend on whether or not you make a donation.)
Germans can make a tax-deductible donation to the Dialog
Zentrum Berlin e.V., Konto-Nr. 1551390051, Bank für Kirche und
Diakonie BLZ 35060190.
Or visit the Xenu bookstore.
5. Why does Xenu's Link Sleuth (TM) report http://www.site.com/../page/index.html as broken?
The key is the "../" part. It means
that a page (e.g. a top level page) links to a page in a directory above
the top level, which doesn't exist. It is true that Mozilla will not have any problems
with such a page; but I am less tolerant.
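A check of this kind can be sketched in a few lines of Python. The function name `climbs_above_root` is made up for this illustration, and it is not claimed to be the test that Link Sleuth performs; it simply counts directory levels and reports when a ".." would go above the top level.

```python
from urllib.parse import urlsplit

def climbs_above_root(url):
    """Return True if '..' segments in the path would go above the server root."""
    depth = 0
    for segment in urlsplit(url).path.split("/"):
        if segment == "..":
            depth -= 1
            if depth < 0:
                return True  # tried to go above the top-level directory
        elif segment not in ("", "."):
            depth += 1
    return False

print(climbs_above_root("http://www.site.com/../page/index.html"))      # True
print(climbs_above_root("http://www.site.com/dir/../page/index.html"))  # False
```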
6. How can I configure a proxy?
You can configure a proxy in the Windows Control Panel. Double-click
on the "Internet" symbol, then click on the tab of the dialog box that
is named "Connection". You will need a proxy if you are sitting "behind
a firewall". This is usually the case in big corporate networks.
7. Why does Xenu's Link Sleuth (TM) report a URL with a space in it?
Either because you do have a space in the URL, or because you have a carriage
return / newline in it. Although Mozilla tolerates this, I do not.
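To find out which of the two it is, the offending characters can be made visible. The following Python sketch is only an illustration (the helper name `find_whitespace` is an assumption, not part of the software); it lists every whitespace character in a URL, so a hidden newline shows up as '\n'.

```python
import re

def find_whitespace(url):
    """Return (position, character as repr) for each whitespace character in a URL."""
    return [(m.start(), repr(m.group())) for m in re.finditer(r"\s", url)]

for url in ("http://www.host.com/my page.html",
            "http://www.host.com/page\n.html",
            "http://www.host.com/page.html"):
    print(find_whitespace(url))
```

An empty list means the URL is clean.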
8. I use Mozilla 3.0 Gold and can't get rid of file: URLs for images. What can I do?
Re-edit the page, double-click on the picture, remove file:
from the picture location and take care to uncheck "copy image to document's
location" in the "properties" dialog box (at the bottom left) before you
save and exit the dialog box.
9. What is the maximum number of websites that can be checked?
There is no fixed maximum. The only limit is the memory of your computer.
10. Can the software check my site locally?
Since September 1998 (1.0n), you can do so without a local web server (with
a local web server, your address would be http://127.0.0.1).
Use the "Browse" button in the "New" dialog box.
The results will not always be the same as a "remote" check:
- Sometimes you'll get "error 3". It happens because the WININET.DLL is unable to handle directories, i.e. links that end with "/". You can avoid this by linking to the actual "main file", usually index.html or default.html. Your browser can handle local directories and display them nicely because it does additional work, which I do not.
- Mixups of higher/lower case characters in links won't be found, since Windows does not make a difference. But UNIX does!
- The main reason that you still need to make occasional "remote" checks is that you might have forgotten to upload your files to your WWW server.
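The upper/lower case problem from the list above can be caught before uploading with a separate little script. This Python sketch is an assumption about how one might do it, not a feature of the software: `case_mismatch` is a hypothetical helper that compares a linked file name with the names that really exist in the directory.

```python
def case_mismatch(link_name, actual_names):
    """Return the real file name if link_name matches it only when case is ignored."""
    if link_name in actual_names:
        return None  # exact match, this link also works on UNIX
    for name in actual_names:
        if name.lower() == link_name.lower():
            return name  # works on Windows, broken on UNIX
    return None

files = ["index.html", "logo.gif"]
print(case_mismatch("Index.html", files))  # index.html
print(case_mismatch("logo.gif", files))    # None
```

In practice the list of names would come from something like os.listdir() on the local copy of the site.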
A user of IE 4.0 reported that when not online, the software checks every
"remote" URL like a local file. This is a problem of the newer version
of the WININET.DLL; the version with IE 3.0 reports "no connection" or
"no such host" instead, which is more logical.
11. Does it work on Windows NT 3.51?
One user said it worked fine after he copied a version of WININET.DLL from
a Windows 95 system standing nearby, and put it into the directory where
Xenu's Link Sleuth(TM) was installed.
12. How is it so damn fast?
Because it uses a (possibly patented, see patents here and here)
technique known as preemptive multithreading. It means that the
link checking software retrieves several web pages at the same time; the
competition uses the same technique. The maximum count of threads is initially
set to 30, but you can configure it to any number between 1 and 100. A
number that is too high might result in failed connections or in timeouts,
which means you will have to recheck the broken links. When I had
a dial-up connection, I got good results with 70. Now I have a DSL connection,
and I have to set the number to 1-5. I suspect that my DSL provider has
installed a brake somewhere to prevent "commercial" customers from using
the inexpensive "private" service.
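The general idea (several pages retrieved at the same time, with a configurable upper limit of threads) can be sketched in Python. This is not the actual implementation, which is an MFC application; `check_all`, `clamp_threads` and the caller-supplied `check_one` function are hypothetical names used only for this example. The default of 30 and the range 1 to 100 are taken from the text above.

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULT_THREADS = 30  # initial setting mentioned in the text

def clamp_threads(n):
    """Keep the thread count inside the configurable range 1..100."""
    return max(1, min(100, n))

def check_all(urls, check_one, max_threads=DEFAULT_THREADS):
    """Check all URLs concurrently and return a dict of url -> result.

    check_one is any function that takes a URL and returns its status.
    """
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=clamp_threads(max_threads)) as pool:
        results = pool.map(check_one, urls)  # results come back in input order
        return dict(zip(urls, results))

# Usage with a stand-in checker that needs no network access:
fake_check = lambda url: "ok" if url.endswith("/") else "not found"
print(check_all(["http://www.host.com/", "http://www.host.com/x"], fake_check, max_threads=5))
```

With a slow or throttled connection, a small value such as 1 to 5 for max_threads corresponds to the setting described above.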
13. Can I have the source code?
Hahahahahaha!
14. Can I buy the source code?
Sure, make me "an offer I can't refuse".
15. Just for fun, I checked Tilman's web site, and found many broken links. Why?
I check my own web site every week on Friday. Nevertheless there are always
broken links:
- Links that I know to be broken: I keep them like that to remind me to find these people some day. The web page itself has a notice that the link is broken.
- Temporarily unreachable hosts: these are temporary routing errors.
- Really broken links: I will usually correct the link or remove it within the next few days.
16. How do I correct broken links?
Repairing broken links (i.e. getting the correct ones) is a difficult task
that takes time, but with experience, you'll get it done faster and faster.
- If you have the e-mail address of the site owner (because you know him), try an e-mail. Sometimes the address still works, even if the web site is gone.
- Find the home page of the site you link to, to see if the site has a "sorry we moved" message. If you linked to http://www.host.com/~user/page888.html and this is broken, look at http://www.host.com/~user/ to see if there is a message, or to see if the site has been reorganized. Some sites organize their user pages differently, e.g. http://www.host.com/homepages/users/page888.html. Sometimes a site switches between the two methods. Other sites are owned by the user himself, e.g. www.user.com, so the home page is the root page. If the site exists but you cannot find your page, send an e-mail to the owner.
- Use search engines to find the site or the name of the site owner (if you know it). To find where the site is, use web search engines (like Google or the Internet Archive) and usenet search engines (like Google Groups).
  - You find the site you searched for
  - You find a site that links to the site you searched for
  - You find the site in the Google Cache or the Internet Archive (simply enter the URL in the search box!), and can use the contents to search for the name of the owner
  - You find a site that links to the site you searched for, but is also broken. E-mail the site owner, and tell him that the link is broken. Bookmark the site and revisit it in a week, to see if the other person has found it. If not, you have nevertheless succeeded in making the other person feel as bad as you, which brings some relief :-)
  - You find the new e-mail address of the user. Either e-mail him, or try to construct the URL yourself (user@host.com leads to http://www.host.com/~user/)
- Post a message in a newsgroup that deals with the topic. Hopefully the site owner or one of his friends reads the messages there.
- If you are still unsuccessful, either delete your link to the site or repeat your attempts after a month (some sites might reappear in a search engine after some time). Sometimes it happens that a host is reorganizing its hard disk, and all user pages come back within a few days.
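Two of the steps above are mechanical enough to be sketched in Python: walking up from a broken page to its parent directories, and guessing a home page from an e-mail address. Both helpers (`parent_directories`, `guess_home_page`) are hypothetical and only produce candidates to try by hand; the "~user" convention is the one used in the examples above and will not fit every host.

```python
from urllib.parse import urlsplit, urlunsplit

def parent_directories(url):
    """Yield parent directory URLs, from the nearest one up to the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Drop the last segment (the page itself), then walk up one level at a time.
    for end in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:end])
        if not path.endswith("/"):
            path += "/"
        yield urlunsplit((parts.scheme, parts.netloc, path, "", ""))

def guess_home_page(email):
    """Guess a home page from an e-mail address (user@host.com style only)."""
    user, _, host = email.partition("@")
    return f"http://www.{host}/~{user}/"

print(list(parent_directories("http://www.host.com/~user/page888.html")))
# ['http://www.host.com/~user/', 'http://www.host.com/']
print(guess_home_page("user@host.com"))
# http://www.host.com/~user/
```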
17. What about ftp and gopher sites?
Starting with version 1.0k I have implemented a new ftp checking method
that is 100% reliable. Sadly, this method does not work with proxies.
The previous method I used (and still use for
gopher) was unreliable, as it did not detect certain errors.
The method for checking gopher sites is still unreliable. When an ftp
or gopher site is accessed through a proxy, this proxy builds up a web
page. Sadly, it doesn't always indicate whether the URL
exists or not. When you access a gopher site without a proxy, it returns
an error message, but not an error code. This seems to be a bug
of the OpenURL() function of WININET.DLL.
The output lists ftp and gopher sites as links, which allows you to
make a manual check of these sites.
18. Why can't I launch URLs?
Starting with version 1.0g (Christmas 1997), URLs are launched with DDE
("dynamic data exchange", a Windows method of communication between applications),
to open many browser windows but to prevent the opening of several Netscape
applications. This is done with the help of the Registry, by searching
for HKEY_CLASSES_ROOT\http\shell\open. This key holds the path for the
browser, the DDE application name (e.g. "Netscape", "IExplore"), the DDE topic (usually
"WWW_OpenURL"), and a template for the DDE item (usually "%1").
If you cannot launch a URL, do not panic - export and e-mail me the segment
of your registry (start REGEDIT.EXE, and search for "http").
The cause is usually that you have not installed your browser properly (maybe
you just transferred the files from another computer). Solution: update or reinstall your browser.
Starting with version 1.1b, I have stopped displaying an error message
when the registry is incomplete, because there were too many complaints.
Instead, the browser will simply be launched with the page. This has the
disadvantage that the page won't be displayed in an extra window of the
currently active browser application.
18a. Why does the browser not open a new window?
This is a problem with Microsoft Internet Explorer. Open your registry and search for HKEY_CLASSES_ROOT\http\shell\open\ddeexec. If the key value is "%1",,-1,0,,,, then change it to "%1",,0,0,,,, (i.e. you change the -1 to 0).
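The change is a plain text substitution, which the following Python sketch spells out. It does not touch the registry; `fix_ddeexec` and `dde_item` are made-up helper names, and the only values used are the ones quoted above. The second function shows, as an assumption about how such a template is used, that "%1" is the placeholder for the URL.

```python
OLD_VALUE = '"%1",,-1,0,,,,'
NEW_VALUE = '"%1",,0,0,,,,'

def fix_ddeexec(value):
    """Return the corrected ddeexec value (the -1 changed to 0), or the value unchanged."""
    return NEW_VALUE if value == OLD_VALUE else value

def dde_item(template, url):
    """Fill a DDE item template by substituting the URL for %1."""
    return template.replace("%1", url)

print(fix_ddeexec('"%1",,-1,0,,,,'))                     # "%1",,0,0,,,,
print(dde_item(NEW_VALUE, "http://www.host.com/~user/"))  # "http://www.host.com/~user/",,0,0,,,,
```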
18b. Why does Link Sleuth freeze when launching the report or a URL?
I do not know why this happens, but I have experienced this myself with
Windows ME (but not with Windows XP), and have received similar reports from users. The problem
goes away by rebooting Windows, but comes back later. You can also get
rid of the problem by making a change in the XENU.INI file: below
the line with [Options], enter this:
UseDDE=0
The only disadvantage is that it will not open a new window in the browser.
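If you prefer to script the change instead of using a text editor, a sketch with Python's configparser could look like this. It assumes that XENU.INI is an ordinary Windows INI file that configparser can read, which has not been verified here; note also that configparser drops comments when writing, so try it on a copy first. The helper name `set_option` is hypothetical.

```python
import configparser
import io

def set_option(ini_text, key, value):
    """Return INI text with key=value set in the [Options] section."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep the case of keys such as UseDDE
    parser.read_string(ini_text)
    if not parser.has_section("Options"):
        parser.add_section("Options")
    parser.set("Options", key, str(value))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()

print(set_option("[Options]\ntimeout=60\n", "UseDDE", 0))
```

To apply it, read the text of XENU.INI, pass it through set_option and write the result back.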
19. Why is LinkSleuth messing around with cookies?
This should not happen anymore, because Xenu now rejects all cookies.
Old explanation:
If you ask this, then you have configured your internet configuration
to ask before submitting a cookie, and constantly get requests. But
sadly I am not responsible for this - it is a part of Microsoft's WININET.DLL.
According to Cookie Central, there is not much you can do.
20. Why are some links reported as "broken" by Xenu, that can be displayed
within my browser?
Some servers read the "User Agent", i.e. the name of the software that
tries to access a website. Some websites are programmed only for Netscape
and Internet Explorer, and refuse everything else. Some may even specifically
refuse Xenu because of past misuse. A user-configurable "User Agent" would
be the solution, but this would make abuse possible.
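To see whether a server treats clients differently, you can request the same URL with two different "User Agent" names and compare the status codes. The Python sketch below only illustrates the mechanism with the standard urllib; the agent name "ExampleLinkChecker/1.0" is invented for this example, and `status_for` needs a network connection when it is actually called.

```python
import urllib.error
import urllib.request

def build_request(url, user_agent):
    """Create a request that announces itself with the given User-Agent name."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def status_for(url, user_agent, timeout=60):
    """Return the HTTP status code the server sends for this User-Agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 if the server refuses this client

request = build_request("http://www.host.com/", "ExampleLinkChecker/1.0")
print(request.get_header("User-agent"))  # ExampleLinkChecker/1.0
```

If status_for() gives different codes for a browser-like name and for the checker's name, the server is filtering by "User Agent".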
21. Why can't I connect to "secure" (https) sites?
If you have set your proxy correctly, try to connect
with IE. If this doesn't work, read this usenet post for help.
If this still doesn't work and you use Windows
NT 4.0, install the latest NT service packs (up to SP5).
22. Any known problems with Windows 95?
Some people have reported crashes. These problems were usually solved by
installing IE 3.0 (or higher) or the following service packs:
One guy had problems with the WININET.DLL (v. 4.70.1300) installed with
OEM Windows 95 (v. 95 4.00.950 C). Changing to version 4.70.1335 solved
the problem.
A simpler solution is to go to http://windowsupdate.microsoft.com
and install whatever they tell you (you need to have IE 4.0 or higher on
your system)
23. Any known problems with Windows 2000?
Although I received many reports that it runs fine, one user reported a
problem and a solution:
Windows 2000 automatically sets a configuration option to use HTTP 1.1
for connecting to web sites. Many, many web sites do not use that version
but continue to use HTTP 1.0, so the automatic setting may prevent connections.
This is the reason why Xenu would not run for me. When I disabled that
setting, Xenu performed properly.
To disable that setting: Control Panel -> Internet Options -> Advanced
(tab) -> HTTP 1.1 settings (list heading) -> Use HTTP 1.1 (checkbox: uncheck it)
24. Can I configure the timeout?
Enter the number of seconds in the [Options] segment in XENU.INI,
e.g. as timeout=120. The default value is 60. Note that this isn't
"perfect". Microsoft Windows has a bug
so that the timeout can't be set the way it should. I am using a workaround
recommended by Microsoft. However, I have observed that it doesn't work
if the timeout "hits" while trying to find out if a host name exists.
Alternatively, try this:
- Start the Registry Editor (REGEDIT.EXE)
- Go to HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings
- Select New > DWORD from the Edit menu
- Call it ReceiveTimeout with a value of <number of seconds>*1000 (The "hidden" default is 300000, i.e. five minutes, which is too long)
- Restart your system
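The value is in milliseconds, so 120 seconds become 120000. The steps above could also be scripted, as in the following Python sketch. It is an untested assumption rather than a recommended procedure: `write_receive_timeout` uses the standard winreg module, only works on Windows, and the key is spelled "Internet Settings" (with a space), as it normally appears in the registry.

```python
import sys

KEY_PATH = r"Software\Microsoft\Windows\CurrentVersion\Internet Settings"

def receive_timeout_ms(seconds):
    """Convert a timeout in seconds to the millisecond value for ReceiveTimeout."""
    return int(seconds * 1000)

def write_receive_timeout(seconds):
    """Store ReceiveTimeout as a DWORD under HKEY_CURRENT_USER (Windows only)."""
    if sys.platform != "win32":
        raise OSError("the registry is only available on Windows")
    import winreg
    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, KEY_PATH) as key:
        winreg.SetValueEx(key, "ReceiveTimeout", 0, winreg.REG_DWORD,
                          receive_timeout_ms(seconds))

print(receive_timeout_ms(120))  # 120000
print(receive_timeout_ms(300))  # 300000, the "hidden" default of five minutes
```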
25. What about JavaScript?
The software does not check links generated by JavaScript, because JavaScript
is a programming language, not a formatting language. This makes web pages
dynamic; a page may, for example, depend on a mouse movement from minutes ago. While it
would probably be easy to check JS links like javascript:newWindow('../popup/glossary.html#xenu')
the problem is that not all JavaScript links are done this way. Many authors
supply their own newWindow() function. If you have an idea for an easy
solution, e-mail me.
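The "easy" case can be sketched with a regular expression in Python. This is only an illustration of the idea and of its limits, not something the software does: the hypothetical `extract_js_target` pulls the function name and the first quoted argument out of a javascript: link, but it cannot know what an author's own function really does with that argument.

```python
import re

# javascript:<function>('<first string argument>' ...
JS_LINK = re.compile(r"""javascript:\s*(\w+)\(\s*(['"])(.*?)\2""", re.IGNORECASE)

def extract_js_target(href):
    """Return (function name, first string argument) of a javascript: link, or None."""
    match = JS_LINK.match(href)
    if not match:
        return None
    return match.group(1), match.group(3)

print(extract_js_target("javascript:newWindow('../popup/glossary.html#xenu')"))
# ('newWindow', '../popup/glossary.html#xenu')
print(extract_js_target("javascript:history.back()"))
# None
```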
26. What about passwords entered in a FORM?
The software is not able to enter passwords in a FORM. I just don't see
a way to accomplish this easily. I assume it is possible if one combines
a set of variable names, values, and a web page that would accept them
with a POST command. I have not even taken the time to investigate how
others do it; if you have an idea for an easy solution, e-mail me.
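The combination described above (variable names, values, and a page that accepts them with POST) can be sketched in Python with the standard urllib. The page address and the field names "user" and "password" are hypothetical; a real form would need the names from its own HTML, and the sketch ignores cookies and sessions, which most logins also require.

```python
import urllib.request
from urllib.parse import urlencode

def build_login_request(action_url, fields):
    """Build a POST request that submits the given form fields."""
    data = urlencode(fields).encode("ascii")  # e.g. b"user=alice&password=secret"
    return urllib.request.Request(action_url, data=data, method="POST")

request = build_login_request(
    "http://www.host.com/login",                 # hypothetical FORM action
    {"user": "alice", "password": "secret"},     # hypothetical field names
)
print(request.get_method())  # POST
print(request.data)          # b'user=alice&password=secret'
```

Sending it with urllib.request.urlopen(request) would then return the page behind the form.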
27. How about a WAP version?
Xenu has checked .wml files since February 2001.
28. What about these error codes?
I identify only a subset of all possible error codes in the "Status" column.
If you get an unknown error code in the Xenu application window, you can
scroll to the right for an explanation text.
More information:
Bug List
The software works pretty well, but here is the list of things that shouldn't happen:
- The thread count is sometimes incorrect if the maximum is changed while active
- The thread count is sometimes incorrect at the end of the session
- The </A> closing tag must not have spaces or newlines inside
- <applet code="myclass.class" archive="jump.zip"> will produce a broken link if myclass.class exists, but only in the archive
- Leftover TGH*.* files in the %TEMP% directory
- Weird effects when the INI file is larger than 64K
If you find another bug, e-mail me a description. Please include the URL you
are checking, and if possible try to save your work in a .XEN file and attach it.
Also check http://windowsupdate.microsoft.com
to make sure that your system has all the updates. If you want to e-mail
a suggestion, click here. You can also join the user group by sending an e-mail to xenu-usergroup-subscribe@yahoogroups.com.
Future feature List
Things I will do in the future (maybe when hell freezes over!):
- Simple conversion from Unicode
- ROBOTS.TXT support
- Detect remote loading of images (geocities sabotages this)
- Custom views in Xenu Window
- Solution for leftover TGH*.* files in temp directory
- Command-line parameters (actually, this has already been done, for a client who agreed to pay for my development time by donating to two people I support. If you need something similar, e-mail me; the price is a $300 donation to be split between two people I support)
- Names of last checked URLs also in the File menu
- Server based Link Sleuthing, i.e. to be used as a CGI application, so that ISPs could offer link sleuthing to their own users, i.e. users could check their own web sites. If you are an ISP who is willing to offer this, contact me to work out details.
- Automatic saving every minute
- A correctly working "Update" feature that rechecks changed sites (tricky, so I will never do it)
- Ideas from Chris:
  - What about identifying how many steps it takes to reach a particular page from the home page, and how many KB had to be downloaded before one could reach it? [TH: useful e.g. to see which steps a user must take to reach the page of a particular product]
- Read RFCs
- Your suggestions: e-mail me also if there is something of the above you'd like to have, and persuade me to do it. If you want to report a bug, click here.
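The idea from Chris in the list above (how many steps from the home page to a given page) is a classic breadth-first search over the link graph. The following Python sketch shows the technique on a hand-made graph; it is not part of the software, and the `links` dictionary stands in for whatever a link checker has collected about which page links to which.

```python
from collections import deque

def click_depths(links, home):
    """Return the minimum number of clicks from home to every reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

links = {
    "/": ["/products/", "/about.html"],
    "/products/": ["/products/widget.html"],
    "/about.html": ["/"],
}
print(click_depths(links, "/"))
# {'/': 0, '/products/': 1, '/about.html': 1, '/products/widget.html': 2}
```

Summing the sizes of the pages along the route would give the amount of data downloaded on the way, provided the sizes were recorded during the check.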
The Story of Xenu's Link Sleuth(TM)
(for fellow software developers)
In April and May 1997 my employer assigned me to an out-of-town job, because
another department needed a guy with MFC experience. So from Monday to
Friday I was away, and in the evenings I was bored to death. Every weekend
I was back home, and I usually checked my web site for broken links with
WebAnalyzer.
Sadly the software had a lot of bugs, and their support was ignoring my
e-mails, and I was mad as hell, as I had spent quite a lot of money on
a product that wasn't worth it. My job was also my first contact with
VC++ 4.2 (previously I had only worked with VC++ 1.5, because our customers
have a lot of 16bit systems), which had some easy-to-use Internet access
classes. I already had experience with WINSOCK programming, but these classes
would spare me a lot of time evaluating HTTP result headers and other annoying
stuff. One evening, after an excellent Italian meal with a good Chianti,
I took some hotel letter paper and wrote down a concept for checking links.
A month later I took some time to install the development software on my
computer and started working, with the help of that hotel-room concept.
The work was done on some evenings, but mostly on weekends, when I had
more time.
My philosophy on software development has always been "smaller,
simpler, cheaper", long before NASA realized this (in May 2002 I was
told that the actual NASA philosophy was Faster, Better, Cheaper
- oops!) Because of that, I need no fancy (but totally useless) graphics
like in WebAnalyzer. Just results. And they'd better be 100% correct or
I'd have to kill myself :-)
The application is written in Visual C++, and uses the MFC classes as much
as possible: CDocument, CView, CListView, CObArray, CMapStringToOb, CArchive,
CInternetSession, CHttpFile, etc, etc. That saved me a lot of time!
Credits
Icons in EXE file: Martin Hunt and Paul Campbell; Icon on web page: Erik
Plummer; Idea to use banners in report: Marc Cross; Xenu logo button: Fred
C.; second Xenu logo button: Charles A. Upsdell; Volcano animated cursor:
Juan C. Pradas-Bergnes; Idea & help with SMTP integration: Mark Findlay;
SMTP class: P.J. Naughter; Xenu artwork: William C. Chenoweth;
Help files: Andrew Schoenhofer.
Links for further reading
Trademarks
Xenu, Xenu's Link Sleuth and Link Sleuth are trademarks used
by Tilman Hausherr for software products and services. These products are
not associated in any way with services licensed by RTC, CoST, BPI, CSI,
etc.
![[Mozilla Open Directory Cool Site Award]](https://web.archive.org/web/20040806000000/http://dmoz.org/img/cool2.gif)
![[ZDNet 4 stars Editor's pick]](https://web.archive.org/web/20040806000000/http://web.archive.org/web/20001009043123/http://www.zdnet.com/downloads/images/red81.gif)
![[Nonags 6 best]](https://web.archive.org/web/20040806000000/http://nonags.com/nonags/imgs/6.gif)
![[Listsoft cool]](https://web.archive.org/web/20040806000000/http://www.listsoft.com/img/buttons/cool.gif)
![[Lockergnome]](https://web.archive.org/web/20040806000000/http://www.lockergnome.com/images/award-1.gif)
![[Completely free software, five doves award]](https://web.archive.org/web/20040806000000/http://www.completelyfreesoftware.com/cfs_award5.gif)
tilman@berlin.snafu.de